Kafka-ML: Connecting the data stream with ML/AI frameworks
Authors
Abstract
Machine Learning (ML) and Artificial Intelligence (AI) depend on data sources to train, improve, and make predictions through their algorithms. With the digital revolution and current paradigms like the Internet of Things, this information is turning from static data into continuous data streams. However, most ML/AI frameworks used nowadays are not fully prepared for this revolution. In this paper, we propose Kafka-ML, a novel open-source framework that enables the management of ML/AI pipelines through data streams. Kafka-ML provides an accessible and user-friendly Web user interface where users can easily define ML models, and then train, evaluate, and deploy them for inferences. Kafka-ML itself and the components it deploys are managed through containerization technologies, which ensure portability, easy distribution, and other features such as fault tolerance and high availability. Finally, a novel approach has been introduced to manage and reuse data streams, which may eliminate the need for data storage or file systems.
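To make the stream-fed pipeline described in the abstract concrete, below is a minimal, hypothetical sketch (not Kafka-ML's actual API) of training a Keras model on records consumed from a Kafka topic rather than from a file or database. The kafka-python client, the broker address, the topic name "training-data", and the JSON message schema are all illustrative assumptions.

    # Sketch only: train a Keras model from a Kafka data stream instead of a static dataset.
    # Assumes a local broker and a hypothetical topic "training-data" whose messages are
    # JSON records of the form {"features": [...], "label": ...}.
    import json
    import numpy as np
    import tensorflow as tf
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "training-data",                      # hypothetical topic name
        bootstrap_servers="localhost:9092",   # assumed broker address
        value_deserializer=lambda m: json.loads(m.decode("utf-8")),
        auto_offset_reset="earliest",
        consumer_timeout_ms=5000,             # stop iterating once the stream goes idle
    )

    # Collect a batch of training records directly from the stream.
    features, labels = [], []
    for message in consumer:
        features.append(message.value["features"])
        labels.append(message.value["label"])

    x_train = np.array(features, dtype="float32")
    y_train = np.array(labels, dtype="float32")

    # A small binary classifier, as a user might define it in the Kafka-ML Web UI
    # (assumes at least one record was consumed so the input shape is known).
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu", input_shape=(x_train.shape[1],)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(x_train, y_train, epochs=5, batch_size=32)

In Kafka-ML, the framework itself handles the deployment, training, and inference stages of such a pipeline inside containers; the sketch only illustrates the underlying idea of replacing file-based datasets with stream consumption.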
Similar resources
Connecting architecture reconstruction frameworks
A number of standalone tools are designed to help developers understand software systems. These tools operate at different levels of abstraction, from low level source code to software architectures. Although recent proposals have suggested how code-level frameworks can share information, little attention has been given to the problem of connecting software architecture level frameworks. In thi...
Connecting Pervasive Frameworks Through Mediation
Context information helps an application decide on what to do in order to adapt to its user’s needs. To easily develop ubiquitous applications, there has been increased research in the design and development of frameworks called pervasive computing frameworks. Although these frameworks help application developers create ubiquitous applications easily, interoperability has been a problem because...
Kafka, Samza and the Unix Philosophy of Distributed Data
Apache Kafka is a scalable message broker, and Apache Samza is a stream processing framework built upon Kafka. They are widely used as infrastructure for implementing personalized online services and real-time predictive analytics. Besides providing high throughput and low latency, Kafka and Samza are designed with operational robustness and long-term maintenance of applications in mind. In thi...
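As a brief illustration of the broker role described above, the following hedged sketch publishes a few events to a Kafka topic and consumes them for a running count, a stand-in for the real-time analytics the cited work discusses. The kafka-python client and the topic name "clicks" are assumptions, not details taken from that paper.

    # Sketch only: publish click events to Kafka and consume them for a simple count.
    import json
    from collections import Counter
    from kafka import KafkaProducer, KafkaConsumer

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )
    for page in ["home", "cart", "home", "checkout"]:
        producer.send("clicks", {"page": page})   # hypothetical topic
    producer.flush()

    consumer = KafkaConsumer(
        "clicks",
        bootstrap_servers="localhost:9092",
        value_deserializer=lambda m: json.loads(m.decode("utf-8")),
        auto_offset_reset="earliest",
        consumer_timeout_ms=3000,
    )
    counts = Counter(msg.value["page"] for msg in consumer)
    print(counts)   # e.g. Counter({'home': 2, 'cart': 1, 'checkout': 1})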
Apache Spark and Apache Kafka at the Rescue of Distributed RDF Stream Processing Engines
Due to the growing need to timely process and derive valuable information and knowledge from data produced in the Semantic Web, RDF stream processing (RSP) has emerged as an important research domain. In this paper, we describe the design of an RSP engine that is built upon state of the art Big Data frameworks, namely Apache Kafka and Apache Spark. Together, they support the implementation of a...
Heterogeneity-aware scheduler for stream processing frameworks
This article discusses problems and decisions related to scheduling of stream processing applications in heterogeneous clusters. An overview of the current state of the art of the stream processing on heterogeneous clusters with a focus on resource allocation and scheduling is presented first. Then, common scheduling approaches of various stream processing frameworks are discussed and their lim...
Journal
Journal title: Future Generation Computer Systems
Year: 2022
ISSN: 0167-739X, 1872-7115
DOI: https://doi.org/10.1016/j.future.2021.07.037